NOTE ON SIMPLIFIED ESTIMATORS FOR TYPE I EXTREME-VALUE DISTRIBUTION

Julius  Lieblein 


Methods for extreme-value analysis (for the Type I extreme-value
distribution) that have optimum properties involve up to 20 quantities
(depending on sample size) whose values are known to 6 decimal places.
The  present  note  shows  how  to  modify  these  to  much  simpler  values  involving 
2  decimal  places  that  are  more  convenient  to  use  yet  sacrifice  very  little 
of  the  optimum  features. 

Key  words:  Simplified  estimators;  linear  unbiased  estimators;  bias;  efficiency; 
extreme  values;  Type  I  distribution;  statistics. 

1.  Introduction 

An  NBSIR  by  the  writer  [1]  described  the  occurrence  and  nature  of  the 
Type  I  extreme-value  distribution  and  presented  estimates  of  the  two  parameters 
of  this  distribution  for  various  ranges  of  sample  sizes  from  very  small  to  very 
large. It was explained that for any sample size there exists a BLUE (best
linear unbiased estimator) with optimum properties. These estimators are linear
functions of the sample order statistics, i.e., the observations arranged in
ascending order. The coefficients of such estimators were given up to sample
size n = 16, and are known up to n = 20, to six decimal places.

For  rapid  and  convenient  use,  it  seems  desirable  to  try  to  replace  the  more 
exact  six-decimal  coefficients  by  much  simpler  values,  with  two  or  even  one 
decimal  place  or  significant  figure.    It  is  the  purpose  of  this  note  to  show  how 
to  obtain  such  "simplified  estimators"  with  properties  almost  as  good  as  the 
more  exact  best  ones.    For  this  it  will  first  be  necessary  to  present  the 
expected  value  and  variance  of  linear  forms,  as  related  to  the  extreme-value 
distribution. 

The linear (order statistics) estimators of the parameters u, b are:

\[ \hat{u} = \sum_{i=1}^{n} a_i x_i, \qquad \hat{b} = \sum_{i=1}^{n} b_i x_i, \tag{1} \]

or

\[ \hat{c} = \begin{pmatrix} \hat{u} \\ \hat{b} \end{pmatrix} = C' x, \tag{1a} \]

where C is the n×2 matrix of coefficients (prime denotes transpose) and x is
the n-rowed vector of the n observations after arrangement in ascending order
(order statistics), i.e.,

\[ x_1 \le x_2 \le \cdots \le x_n. \]


Before ordering, the x's are independent observations from the Type I
extreme-value distribution

\[ \mathrm{Prob}\{X < x\} = \exp\!\left[-e^{-(x-u)/b}\right], \qquad -\infty < x < \infty,\quad -\infty < u < \infty,\quad 0 < b < \infty. \tag{2} \]


The expected values of the estimators (1a) are given by:

\[ E(\hat{c}) = C' E(x). \tag{3} \]

In order to proceed, the random variables $x_i$ must be expressed so as to
exhibit the parameters explicitly. This is done by writing each $x_i$ as:

\[ x_i = u + b\,y_i, \qquad i = 1, \ldots, n, \tag{4} \]


where the $y_i$ are the n order statistics from the "reduced", parameter-free
distribution (corresponding to the standardized distribution in the
normal-distribution case)

\[ \mathrm{Prob}\{Y < y\} = \exp\!\left(-e^{-y}\right), \qquad -\infty < y < \infty. \tag{5} \]
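The relation between (2), (4) and (5) can be checked numerically. The following is a minimal sketch in Python, not part of the original note; the function names and the parameter values u = 30, b = 5 are hypothetical and chosen only for illustration.

```python
import math

def type1_cdf(x, u=0.0, b=1.0):
    """Prob{X < x} for the Type I (largest-value) distribution, Eq. (2).

    With the defaults u = 0, b = 1 this is the reduced distribution, Eq. (5).
    """
    return math.exp(-math.exp(-(x - u) / b))

def type1_quantile(p, u=0.0, b=1.0):
    """Inverse of type1_cdf, valid for 0 < p < 1."""
    return u - b * math.log(-math.log(p))

# Hypothetical parameter values, for illustration only.
u, b = 30.0, 5.0
probs = [0.1, 0.5, 0.9]

y = [type1_quantile(p) for p in probs]        # reduced quantiles
x = [type1_quantile(p, u, b) for p in probs]  # quantiles of X

# Eq. (4): each x equals u + b*y; ordering is preserved because b > 0.
for xi, yi in zip(x, y):
    assert abs(xi - (u + b * yi)) < 1e-12
```

Because b > 0, the transformation in (4) keeps the observations in the same order, which is why the order statistics of x map directly onto those of y.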


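Eq. (3) then becomes

\[ E(\hat{c}) = C'\,\bigl(u\mathbf{1} + b\,E(y)\bigr)
   = C' \begin{pmatrix} 1 & E y_1 \\ \vdots & \vdots \\ 1 & E y_n \end{pmatrix}
        \begin{pmatrix} u \\ b \end{pmatrix}
   = C' e\, c, \tag{6} \]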
where $\mathbf{1}$ is the n×1 vector of 1's, E(y) is the n×1 vector of the known
expected values of the order statistics $y_i$, e is the n×2 matrix with columns
$\mathbf{1}$ and E(y), and c is the 2×1 column vector of the parameters u, b.
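The expected values E(y) that enter the matrix e can be computed directly. The sketch below is an illustration rather than the method used in the original note: it relies on the standard fact that the largest of m reduced variates has mean γ + ln m (γ is Euler's constant) and on the binomial expansion of the order-statistic density. The function name is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def expected_reduced_order_stats(n):
    """E(y_i), i = 1..n, for the ascending order statistics of a reduced sample.

    Expanding (1 - F)^(n - i) in the density of the i-th order statistic gives
    an alternating sum of terms (gamma + ln m) / m with m = i + k.
    """
    ey = []
    for i in range(1, n + 1):
        lead = math.factorial(n) // (math.factorial(i - 1) * math.factorial(n - i))
        total = 0.0
        for k in range(n - i + 1):
            m = i + k
            total += (-1) ** k * math.comb(n - i, k) * (EULER_GAMMA + math.log(m)) / m
        ey.append(lead * total)
    return ey

ey = expected_reduced_order_stats(10)
# The n expected values must average to the mean of a single reduced variate.
print(sum(ey) / len(ey), EULER_GAMMA)
```

The alternating sum loses some precision as n grows, so for sample sizes much beyond those discussed here a tabulated or higher-precision source would be preferable.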

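For unbiasedness, the expected values of the estimators (6) must
equal the parameters. The conditions for this from (6) are:

\[ C' e\, c = c, \qquad \text{or} \qquad (C' e - I_2)\, c = 0, \tag{7} \]

i.e.,

\[ \sum_{i=1}^{n} a_i = 1, \qquad \sum_{i=1}^{n} a_i\, E y_i = 0, \tag{7a} \]

\[ \sum_{i=1}^{n} b_i = 0, \qquad \sum_{i=1}^{n} b_i\, E y_i = 1. \tag{7b} \]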
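As a small worked illustration (not taken from the original note), consider n = 2, for which the reduced order statistics have $E y_1 = \gamma - \ln 2$ and $E y_2 = \gamma + \ln 2$, with $\gamma \approx 0.577216$ Euler's constant. With two coefficients and two conditions each, (7a) and (7b) determine the estimators completely:

```latex
\begin{align*}
a_1 + a_2 = 1,\quad a_1(\gamma - \ln 2) + a_2(\gamma + \ln 2) = 0
  \;&\Longrightarrow\;
  a_1 = \tfrac{1}{2}\Bigl(1 + \frac{\gamma}{\ln 2}\Bigr) \approx 0.916373,\quad
  a_2 = \tfrac{1}{2}\Bigl(1 - \frac{\gamma}{\ln 2}\Bigr) \approx 0.083627, \\
b_1 + b_2 = 0,\quad b_1(\gamma - \ln 2) + b_2(\gamma + \ln 2) = 1
  \;&\Longrightarrow\;
  b_2 = -b_1 = \frac{1}{2\ln 2} \approx 0.721348 .
\end{align*}
```

For n > 2 the conditions leave n − 2 coefficients free in each estimator, and it is this freedom that the minimum-variance requirement of the BLUE uses up.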

The  BLUE  are  unbiased,  and  unique,  being  "best",  by  definition  and  calculation. 
Therefore  any  alteration  such  as  simplified  estimators  would  result  in  bias. 
However,  the  variance  may  be  less,  since  we  are  no  longer  restricted  to  the 
class  of  unbiased  estimators.    The  measure  of  goodness  of  the  estimator  must 
then  be  modified  to  include  the  bias;  it  becomes  the  mean  square  error 
of  the  estimator  about  the  parameter  estimated,  not  about  its  expected 
value  as  is  the  case  with  the  variance,  i.e. 

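\[ \mathrm{MSE}(\hat{u}) = E(\hat{u} - u)^2 = E\bigl[(\hat{u} - E\hat{u}) + (E\hat{u} - u)\bigr]^2
   = E(\hat{u} - E\hat{u})^2 + (E\hat{u} - u)^2
   = \mathrm{VARIANCE}(\hat{u}) + [\mathrm{BIAS}(\hat{u})]^2, \tag{8} \]

the middle term on expanding the square vanishing because it is a multiple
(namely, $2(E\hat{u} - u)$) of:

\[ E(\hat{u} - E\hat{u}) = E\hat{u} - E\hat{u} = 0. \]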

For  unbiased  estimators,  MSE  and  variance  are  the  same. 
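The decomposition (8) can be illustrated by simulation. The sketch below is an assumption-laden example, not the computation used in the original note: the parameter values are hypothetical, and the estimator is the plain sample mean, chosen because it is a simple linear estimator of u that is deliberately biased (its expected value is u + γb, with γ Euler's constant).

```python
import math
import random

def draw_type1(rng, u, b):
    """One draw from the Type I distribution (2) by inverting its CDF."""
    p = rng.random()
    while p == 0.0:          # avoid log(0); random() never returns 1.0
        p = rng.random()
    return u - b * math.log(-math.log(p))

def mse_decomposition(estimates, true_value):
    """Return (mse, variance, bias) computed from simulated estimates."""
    m = len(estimates)
    mean = sum(estimates) / m
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    variance = sum((e - mean) ** 2 for e in estimates) / m
    return mse, variance, mean - true_value

rng = random.Random(12345)   # fixed seed so the run is repeatable
u, b, n = 30.0, 5.0, 10      # hypothetical parameter values and sample size

# Biased estimator of u: the sample mean of n observations.
estimates = [sum(draw_type1(rng, u, b) for _ in range(n)) / n
             for _ in range(20000)]

mse, variance, bias = mse_decomposition(estimates, u)
# Eq. (8): MSE = variance + bias^2 (an exact identity for these sample moments).
assert abs(mse - (variance + bias ** 2)) < 1e-6
```

Here the squared bias dominates the MSE, which shows why the variance alone is not an adequate measure of goodness once unbiasedness is given up.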

2.  Bias 

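For biased estimators, the bias is given by $(C'e - I_2)\,c$ (see (7)), which
takes the place of the conditions (7a) and (7b), i.e.,

\[ \mathrm{bias}(\hat{u}) = \Bigl(\sum_{i=1}^{n} a_i - 1\Bigr) u + \Bigl(\sum_{i=1}^{n} a_i\, E y_i\Bigr) b, \tag{9a} \]

\[ \mathrm{bias}(\hat{b}) = \Bigl(\sum_{i=1}^{n} b_i\Bigr) u + \Bigl(\sum_{i=1}^{n} b_i\, E y_i - 1\Bigr) b. \tag{9b} \]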
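Formulas (9a) and (9b) translate directly into code. The following sketch is illustrative only: the function name is hypothetical, the case n = 2 is used because its expected values have the closed forms γ − ln 2 and γ + ln 2, and the two-decimal coefficients are made-up inputs, not values from Table 1.

```python
import math

EULER_GAMMA = 0.5772156649015329

def bias_coefficients(coeffs, ey, target):
    """Return (multiplier of u, multiplier of b) in the bias of a linear estimator.

    target = "u" applies Eq. (9a); target = "b" applies Eq. (9b).
    """
    s0 = sum(coeffs)
    s1 = sum(c * e for c, e in zip(coeffs, ey))
    if target == "u":
        return s0 - 1.0, s1
    return s0, s1 - 1.0

# n = 2: expected values of the reduced order statistics in closed form.
ey = [EULER_GAMMA - math.log(2.0), EULER_GAMMA + math.log(2.0)]

# Hypothetical two-decimal coefficients, for illustration only.
a = [0.92, 0.08]
b_o = [-0.72, 0.72]

print(bias_coefficients(a, ey, "u"))    # u-multiplier is zero: the a's sum to 1
print(bias_coefficients(b_o, ey, "b"))  # u-multiplier is zero: the b's sum to 0
```

In both cases the multiplier of u vanishes because the coefficient sums equal 1 and 0 respectively, leaving a bias proportional to b alone.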
The MSE is thus, in general, a quadratic function of the two unknown
parameters u and b and so presents a difficult situation. To make it more
tractable and reach definite results, we make adjustments, which will
usually be small, in the coefficients of the simplified estimator, so that
the parameter u will not appear in the bias.


3.      Simplified  Estimators 

a.  Construction 

The simplified estimators are summarized in Table 1 for n = 10.
The first column gives the coefficients of the BLUE estimators for u and
for b. Col. (2) gives the BLUE coefficients rounded to 2 decimal places.
The a's add to 0.99 instead of 1.00, as would be necessary in (9a) for the
u-term to disappear, so a slight adjustment is made to the coefficient that
it would least affect: in this case, one of the a's is increased by the minute
amount 0.0012, which permits it to round to .01 more and so raises the total
to 1.00 (Col. (3)). Also, it turns out that the b's add to 0.00, so no
adjustment is necessary there. Another type of rounding is to 2 significant
figures instead of 2 decimal places, and this time adjustment is necessary in
both an a- and a b-coefficient, Cols. (4) and (5). The next 4 versions,
Cols. (6) to (9), are formed similarly on the basis of 1 decimal place and
1 significant figure.
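One possible reading of the "round, then adjust" construction is sketched below. It is an interpretation, not the author's procedure: the rule used here nudges the coefficient whose exact value lies closest to the next rounding step, and the input coefficients are hypothetical (they are not taken from Table 1). Note also that Python's `round` rounds exact halves to the nearest even integer.

```python
def round_and_adjust(coeffs, target_sum, places=2):
    """Round coefficients to `places` decimals, then nudge as few entries as
    possible by one unit in the last place so that they add up to target_sum.

    Entries whose exact value is closest to the next rounding step are nudged
    first, so each change relative to the exact coefficient is as small as possible.
    """
    scale = 10 ** places
    units = [round(c * scale) for c in coeffs]        # integers in last-place units
    deficit = round(target_sum * scale) - sum(units)  # units missing (+) or in excess (-)
    step = 1 if deficit > 0 else -1
    # Rounding residual in units; a large residual in the direction of `step`
    # means the coefficient was "almost" rounded the other way.
    order = sorted(range(len(coeffs)),
                   key=lambda i: step * (coeffs[i] * scale - units[i]),
                   reverse=True)
    for i in order[:abs(deficit)]:
        units[i] += step
    return [k / scale for k in units]

# Hypothetical coefficients that sum to 1 exactly but round to a total of 0.99.
exact = [0.423, 0.333, 0.244]
print(round_and_adjust(exact, 1.0))   # [0.42, 0.33, 0.25]
```

The same function with `target_sum=0.0` would serve for the b-coefficients, where the rounded values must add to zero for the u-term in (9b) to vanish.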

b. Variance, MSE and Efficiency Ratio

Using the "propagation of error" formula for variance of a linear
form (see [2]),

\[ \mathrm{var}(L'x) = L'\,V(x)\,L, \tag{10} \]

where  L  and  x  are  n-rowed  column  vectors  of  coefficients  and  variables, 
respectively,  and  V(x)  is  the  nxn  matrix  of  variances  and  covariances  of  the 
x's,  we  have 

\[ \mathrm{var}(\hat{u}) = a'Va = (a'va)\,b^2, \qquad
   \mathrm{var}(\hat{b}) = b_o'Vb_o = (b_o'vb_o)\,b^2, \tag{11} \]

where, by (4),

\[ V(x) = v(y)\,b^2, \tag{12} \]

with v(y) the n×n variance-covariance matrix of the reduced extreme-value
order statistics $y_i$, and the arguments x and y are suppressed for
convenience; the quantities a and $b_o$ are the n-rowed vectors of the
coefficients $a_i$ and $b_i$, respectively. (The subscript "o" is used to avoid
confusion with the parameter b.)
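The scaling in (11) and (12) can be verified with a short quadratic-form routine. This is a sketch with made-up inputs: the 2×2 matrix, the coefficient vector, and the value of b are hypothetical and stand in for the tabulated v(y) and the estimator coefficients.

```python
def quadratic_form(coeffs, cov):
    """Return L' V L for coefficient vector L and covariance matrix V, Eq. (10)."""
    n = len(coeffs)
    return sum(coeffs[i] * cov[i][j] * coeffs[j]
               for i in range(n) for j in range(n))

# Hypothetical reduced covariance matrix v(y) and coefficient vector.
v = [[1.0, 0.5],
     [0.5, 2.0]]
L = [0.75, 0.25]
b = 3.0                      # hypothetical scale parameter

# Eq. (12): covariance matrix of the x's is that of the y's times b^2.
V = [[b * b * vij for vij in row] for row in v]

# Eq. (11): the variance in x-units equals the reduced variance times b^2.
assert abs(quadratic_form(L, V) - quadratic_form(L, v) * b * b) < 1e-12
```

This is why Table 2 can report variances as multiples of b² without knowing b: only the reduced matrix v(y) and the coefficients are needed.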
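From (11) and relations such as (8), with the bias taken from (9a) and (9b)
after the adjustment that removes the u-term, we have:

\[ \mathrm{MSE}(\hat{u}) = \Bigl[\,a'va + \Bigl(\sum_{i=1}^{n} a_i\, E y_i\Bigr)^{2}\,\Bigr] b^2, \tag{13a} \]

\[ \mathrm{MSE}(\hat{b}) = \Bigl[\,b_o'vb_o + \Bigl(\sum_{i=1}^{n} b_i\, E y_i - 1\Bigr)^{2}\,\Bigr] b^2. \tag{13b} \]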


Calculations of bias, variance, and MSE were carried out by use of
OMNITAB on the NBS 1108. A copy of the program is attached, and can be
readily modified to give results for any other sample sizes where the
BLUE coefficients are known; at present they are known* for sample sizes
up to n = 20. The results are shown in Table 2 for n = 10. The 4 adjusted
estimators (Col. (1)) are those in Table 1, Cols. (3), (5), (7), (9), as
indicated. Bias (Col. (2)) is in terms of b only, as shown, since the term in
u has been suppressed through the adjustment. Variance and mean square error,
in terms of $b^2$, are shown in Cols. (3) and (4). Col. (5) gives the
"efficiency ratio", which shows how the "efficiency" measure MSE compares
with that of the "best", BLUE. (A ratio greater than 1 means BLUE is
more efficient, and vice versa.)

For example, when the estimator is simplified and adjusted to
2 decimal places as described above ("2D"), the efficiency is virtually
the same as that of BLUE for the estimator of the parameter u, and only about
1/2% worse (larger MSE) for the estimator of b, as shown in the fourth line of
Column (5) in Table 2. For two significant figures (the third estimator), the
results are virtually the same as for two places. If the estimator is altered
still more drastically, to 1 figure, whether decimal or significant, the
efficiency becomes worse, as might be expected. Similar remarks apply to the
amount of bias, which is virtually nil with two-figure estimators and more
appreciable with one-figure estimators.
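The efficiency ratios quoted above can be recomputed from the bias and variance columns of Table 2 using (8). The sketch below uses the values printed in Table 2 for n = 10; the dictionary layout and function name are hypothetical, and small differences in the last digits are to be expected because the printed inputs are themselves rounded to six places.

```python
# Variance of BLUE (in units of b^2) for the estimators of u and b, Table 2.
VAR_BLUE = {"u": 0.112973, "b": 0.071573}

# (bias/b, variance/b^2, printed efficiency ratio), as printed in Table 2.
TABLE2 = {
    ("2 D", "u"): (0.000458, 0.113002, 1.000256),
    ("2 D", "b"): (0.002725, 0.071973, 1.005686),
    ("1 D", "u"): (-0.050827, 0.113345, 1.026157),
    ("1 D", "b"): (0.192893, 0.102266, 1.948668),
    ("2 S", "u"): (-0.001240, 0.112926, 0.999593),
    ("2 S", "b"): (0.003570, 0.072085, 1.007321),
    ("1 S", "u"): (0.044711, 0.117207, 1.055164),
    ("1 S", "b"): (-0.103800, 0.057531, 0.954340),
}

def efficiency_ratio(bias, variance, var_blue):
    """MSE of the simplified estimator divided by the variance of BLUE."""
    return (variance + bias ** 2) / var_blue

for (label, param), (bias, variance, printed) in TABLE2.items():
    ratio = efficiency_ratio(bias, variance, VAR_BLUE[param])
    print(f"Adj. to {label}, {param}: computed {ratio:.6f}, printed {printed:.6f}")
```

Dividing by the BLUE variance puts every estimator on the same scale, so a value such as 1.0057 reads directly as an MSE about 1/2% larger than the best attainable by an unbiased linear estimator.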

These  results  make  plausible  the  following  statement,  for 
sample  sizes  that  are  not  too  small,  say  6  or  more: 

Two-figure  coefficients  (whether  2  decimal  places 
or  2  significant  figures),  for  estimators  of  the 
two parameters of the Type I extreme-value
distribution,  can  yield  practically  as  good 
efficiency  as  is  obtainable  by  BLUE. 


*See  reference  for  paper  by  White  [1] . 
Table 2.  Bias and Efficiency of Simplified Estimators Adjusted
          so Bias Depends only on b, not u, n = 10 (upper line
          relates to u, lower line to b)

                                                                Efficiency Ratio
Estimator            Bias/b       Variance/b^2    MSE/b^2       (MSE/b^2) / (var(BLUE)/b^2)
(1)                  (2)          (3)             (4)           (5)

BLUE                  0           0.112973        0.112973      1
                      0            .071573         .071573      1

Simplified Est.

Adj. to 2 D           0.000458     .113002         .113002      1.000256
(Col. (3))*            .002725     .071973         .071980      1.005686

Adj. to 1 D           -.050827     .113345         .115928      1.026157
(Col. (5))             .192893     .102266         .139472      1.948668

Adj. to 2 S           -.001240     .112926         .112927      0.999593
(Col. (7))             .003570     .072085         .072097      1.007321

Adj. to 1 S            .044711     .117207         .119205      1.055164
(Col. (9))            -.103800     .057531         .068305       .954340

*Column numbers refer to estimators in Table 1.